exponential smoothing
Joint Prediction Regions for time-series models
Machine Learning algorithms are notorious for providing point predictions but not prediction intervals. There are many applications where one requires confidence in predictions and prediction intervals. Stringing together, these intervals give rise to joint prediction regions with the desired significance level. It is an easy task to compute Joint Prediction regions (JPR) when the data is IID. However, the task becomes overly difficult when JPR is needed for time series because of the dependence between the observations. This project aims to implement Wolf and Wunderli's method for constructing JPRs and compare it with other methods (e.g. NP heuristic, Joint Marginals). The method under study is based on bootstrapping and is applied to different datasets (Min Temp, Sunspots), using different predictors (e.g. ARIMA and LSTM). One challenge of applying the method under study is to derive prediction standard errors for models, it cannot be obtained analytically. A novel method to estimate prediction standard error for different predictors is also devised. Finally, the method is applied to a synthetic dataset to find empirical averages and empirical widths and the results from the Wolf and Wunderli paper are consolidated. The experimental results show a narrowing of width with strong predictors like neural nets, widening of width with increasing forecast horizon H and decreasing significance level alpha, controlling the width with parameter k in K-FWE, and loss of information using Joint Marginals.
Incorporating Exponential Smoothing into MLP: A Simple but Effective Sequence Model
Modeling long-range dependencies in sequential data is a crucial step in sequence learning. A recently developed model, the Structured State Space (S4), demonstrated significant effectiveness in modeling long-range sequences. However, It is unclear whether the success of S4 can be attributed to its intricate parameterization and HiPPO initialization or simply due to State Space Models (SSMs). To further investigate the potential of the deep SSMs, we start with exponential smoothing (ETS), a simple SSM, and propose a stacked architecture by directly incorporating it into an element-wise MLP. We augment simple ETS with additional parameters and complex field to reduce the inductive bias. Despite increasing less than 1\% of parameters of element-wise MLP, our models achieve comparable results to S4 on the LRA benchmark.
A Novel Theoretical Framework for Exponential Smoothing
Bernardi, Enrico, Lanconelli, Alberto, Lauria, Christopher S. A.
Simple Exponential Smoothing is a classical technique used for smoothing time series data by assigning exponentially decreasing weights to past observations through a recursive equation; it is sometimes presented as a rule of thumb procedure. We introduce a novel theoretical perspective where the recursive equation that defines simple exponential smoothing occurs naturally as a stochastic gradient ascent scheme to optimize a sequence of Gaussian log-likelihood functions. Under this lens of analysis, our main theorem shows that -- in a general setting -- simple exponential smoothing converges to a neighborhood of the trend of a trend-stationary stochastic process. This offers a novel theoretical assurance that the exponential smoothing procedure yields reliable estimators of the underlying trend shedding light on long-standing observations in the literature regarding the robustness of simple exponential smoothing.
Exponential Smoothing for Off-Policy Learning
Aouali, Imad, Brunel, Victor-Emmanuel, Rohde, David, Korba, Anna
Off-policy learning (OPL) aims at finding improved policies from logged bandit data, often by minimizing the inverse propensity scoring (IPS) estimator of the risk. In this work, we investigate a smooth regularization for IPS, for which we derive a two-sided PAC-Bayes generalization bound. The bound is tractable, scalable, interpretable and provides learning certificates. In particular, it is also valid for standard IPS without making the assumption that the importance weights are bounded. We demonstrate the relevance of our approach and its favorable performance through a set of learning tasks. Since our bound holds for standard IPS, we are able to provide insight into when regularizing IPS is useful. Namely, we identify cases where regularization might not be needed. This goes against the belief that, in practice, clipped IPS often enjoys favorable performance than standard IPS in OPL.
Data-driven Real-time Short-term Prediction of Air Quality: Comparison of ES, ARIMA, and LSTM
Talamanova, Iryna, Pllana, Sabri
Air pollution is a worldwide issue that affects the lives of many people in urban areas. It is considered that the air pollution may lead to heart and lung diseases. A careful and timely forecast of the air quality could help to reduce the exposure risk for affected people. In this paper, we use a data-driven approach to predict air quality based on historical data. We compare three popular methods for time series prediction: Exponential Smoothing (ES), Auto-Regressive Integrated Moving Average (ARIMA) and Long short-term memory (LSTM). Considering prediction accuracy and time complexity, our experiments reveal that for short-term air pollution prediction ES performs better than ARIMA and LSTM.
An End-to-End Guide on Time Series Forecasting Using FbProphet
This article was published as a part of the Data Science Blogathon. This article will implement time series forecasting using the Prophet library in python. The prophet is a package that facilitates the simple implementation of time series analysis. Implementing time series forecasting can be complicated depending on the model we use. Many approaches are available for time series forecasting, for example, ARIMA ( Auto-Regressive Integrated Moving Average), Auto-Regressive Model, Exponential Smoothing, and deep learning-based models like LSTM ( long short term memory).
12 Useful Algorithms for 12 Days of Christmas
TF-IDF stands for Term Frequency -Inverse Document Frequency, and it is used to determine how important a word is a document in a corpus (a collection of documents). Specifically, the TD-IDF value for a given word increases relative to the number of times a word appears in the document and decreases by the number of documents in the corpus that also contain that particular word. This is to account for words that are used more commonly in general. TF-IDF is a popular technique in the field of Natural Language Processing (NLP) and information retrieval. The Apriori Algorithm is an association rule algorithm and is most commonly used to determine groups of items that are most closely associated with each other in an itemset.
Long Short-term Memory RNN
Vennerød, Christian Bakke, Kjærran, Adrian, Bugge, Erling Stray
This paper is based on a machine learning project at the Norwegian University of Science and Technology, fall 2020. The project was initiated with a literature review on the latest developments within time-series forecasting methods in the scientific community over the past five years. The paper summarizes the essential aspects of this research. Furthermore, in this paper, we introduce an LSTM cell's architecture, and explain how different components go together to alter the cell's memory and predict the output. Also, the paper provides the necessary formulas and foundations to calculate a forward iteration through an LSTM. Then, the paper refers to some practical applications and research that emphasize the strength and weaknesses of LSTMs, shown within the time-series domain and the natural language processing (NLP) domain. Finally, alternative statistical methods for time series predictions are highlighted, where the paper outline ARIMA and exponential smoothing. Nevertheless, as LSTMs can be viewed as a complex architecture, the paper assumes that the reader has some knowledge of essential machine learning aspects, such as the multi-layer perceptron, activation functions, overfitting, backpropagation, bias, over- and underfitting, and more.
Using Exponential Smoothing in Algorithmic Trading.
First, let us start by the usual Stochastic Oscillator before proceeding with the Stochastic Smoothing Oscillator. An overbought level is an area where the market is perceived to be extremely bullish and is bound to consolidate. An oversold level is an area where market is perceived to be extremely bearish and is bound to bounce. Hence, the Stochastic Oscillator is a contrarian indicator that seeks to signal reactions of extreme movements.
Holt-Winters Forecasting for Dummies (or Developers) - Part I - Gregory Trubetskoy
This three part write up [Part II Part III] is my attempt at a down-to-earth explanation (and Python code) of the Holt-Winters method for those of us who while hypothetically might be quite good at math, still try to avoid it at every opportunity. I had to dive into this subject while tinkering on tgres (which features a Golang implementation). And having found it somewhat complex (and yet so brilliantly simple), figured that it'd be good to share this knowledge, and in the process, to hopefully solidify it in my head as well. Triple Exponential Smoothing, also known as the Holt-Winters method, is one of the many methods or algorithms that can be used to forecast data points in a series, provided that the series is "seasonal", i.e. repetitive over some period. In 1957 an MIT and University of Chicago graduate, professor Charles C Holt (1921-2010) was working at CMU (then known as CIT) on forecasting trends in production, inventories and labor force.